Scalable Chi-Square Distance versus Conventional Statistical Distance for Process Monitoring with Uncorrelated Data Variables

Authors

  • Nong Ye
  • Connie M. Borror
  • Darshit Parmar
Abstract

Multivariate statistical process control charts are often used for process monitoring to detect out-of-control anomalies. However, multivariate control charts based on conventional statistical distance measures, such as the one used in Hotelling's T² control chart, cannot scale up to large amounts of complex process data, e.g. data with a large number of variables and a high rate of data sampling. In our previous work we developed a multivariate statistical process monitoring procedure based on a more scalable chi-square distance measure and tested this procedure for detecting out-of-control anomalies (intrusions) in a computer process using computer audit data. The testing results demonstrated that the performance of the scalable chi-square procedure is comparable to that of Hotelling's T² control chart. To establish the chi-square procedure as a generic, viable multivariate statistical process monitoring procedure, we conduct a series of further studies to understand the detection power and limitations of the chi-square procedure for processes with various kinds of data and various types of out-of-control anomalies, in addition to the scalability and demonstrated performance of the chi-square procedure for computer intrusion detection. This paper reports on one of these studies, which investigates the effectiveness of the scalable chi-square procedure in detecting out-of-control anomalies in processes with uncorrelated data variables, each of which has a normal probability distribution. The results of this study indicate that the chi-square procedure is at least as effective as Hotelling's T² control chart for monitoring processes with uncorrelated data variables. Copyright © 2003 John Wiley & Sons, Ltd.
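The contrast between the two distance measures can be illustrated with a small sketch. The abstract does not state either formula, so both are assumptions here: the chi-square distance is taken as the sum of (x_i - mean_i)² / mean_i over the variables, and Hotelling's T² is written for the uncorrelated case only, where a diagonal covariance matrix reduces it to the sum of (x_i - mean_i)² / variance_i. The function names and the sample data are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def fit_in_control(samples):
    """Estimate per-variable means and sample variances from in-control rows."""
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    variances = [variance(col) for col in columns]
    return means, variances


def chi_square_distance(x, means):
    """Assumed chi-square distance: squared deviation scaled by the mean.

    Needs only the mean vector, but every mean must be positive.
    """
    return sum((xi - mi) ** 2 / mi for xi, mi in zip(x, means))


def t2_uncorrelated(x, means, variances):
    """Hotelling's T^2 under an assumed diagonal covariance matrix.

    With uncorrelated variables the matrix inverse reduces to a
    per-variable division by the variance.
    """
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, means, variances))


# Hypothetical in-control data: three observations of two variables.
in_control = [[10.0, 20.0], [12.0, 22.0], [8.0, 18.0]]
means, variances = fit_in_control(in_control)

# A new observation whose first variable has shifted.
observation = [14.0, 20.0]
chi2_value = chi_square_distance(observation, means)
t2_value = t2_uncorrelated(observation, means, variances)
print(chi2_value, t2_value)
```

Under these assumptions the chi-square distance needs only the in-control mean vector, whereas the general T² statistic needs an estimated covariance matrix and its inverse, which is one plausible reading of the scalability difference the abstract describes. A control limit for either statistic would still have to be chosen separately.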

Similar resources

A Hybrid SPC Method with the Chi-Square Distance Monitoring Procedure for Large-scale, Complex Process Data

Standard multivariate statistical process control (SPC) techniques, such as Hotelling's T², cannot easily handle large-scale, complex process data and often fail to detect out-of-control anomalies for such data. We develop a computationally efficient and scalable Chi-Square (χ²) Distance Monitoring (CSDM) procedure for monitoring large-scale, complex process data to detect out-of-control anoma...

Modeling of Optimal Distances of Surveying Stations for Monitoring Astable Engineering Structures Using Geodetic Methods

Large and astable engineering structures are of great importance, and the behavior of such structures is usually monitored using geotechnical and geodetic methods (engineering geodesy). In the geodetic methods, the selection of appropriate surveying stations, in terms of distribution, location, and distance from the target points to the stations of the control network, has a significant role in determining th...

Pedagogical monitoring as a tool to reduce dropout in distance learning in family health

BACKGROUND: This paper presents the results of a study of the Monsys monitoring system, an educational support tool designed to prevent and control the dropout rate in a distance learning course in family health. Developed by UNA-SUS/UFMA, Monsys was created to enable data mining in the virtual learning environment known as Moodle. METHODS: This is an exploratory study using documentary and bib...

A Comparative Study of Hard and Fuzzy Data Clustering Algorithms with Cluster Validity Indices

Data clustering is one of the important data mining methods. It is a process of finding classes in a data set such that similarity is highest within a class and dissimilarity is highest between different classes. The well-known hard clustering algorithm (K-means) and fuzzy clustering algorithm (FCM) are mostly based on the Euclidean distance measure. In this paper, a comparative study of these algorithms with d...

On the multivariate variation control chart

Multivariate control charts such as Hotelling's T² and X² are commonly used for monitoring several related quality characteristics. These control charts use the correlation structure that exists between quality characteristics in an attempt to improve monitoring. The purpose of this article is to discuss some issues related to the G chart proposed by Levinson et al. [9] for detecting shifts in ...

Publication date: 2002